Improved regret for zeroth-order adversarial bandit convex optimisation
Authors
Abstract
Similar resources
Improved Regret Guarantees for Online Smooth Convex Optimization with Bandit Feedback
The study of online convex optimization in the bandit setting was initiated by Kleinberg (2004) and Flaxman et al. (2005). Such a setting models a decision maker that has to make decisions in the face of adversarially chosen convex loss functions. Moreover, the only information the decision maker receives are the losses. The identities of the loss functions themselves are not revealed. In this ...
Bandit Convex Optimization: √T Regret in One Dimension
We analyze the minimax regret of the adversarial bandit convex optimization problem. Focusing on the one-dimensional case, we prove that the minimax regret is Θ̃(√T) and partially resolve a decade-old open problem. Our analysis is non-constructive, as we do not present a concrete algorithm that attains this regret rate. Instead, we use minimax duality to reduce the problem to a Bayesian setti...
Improved Regret Bounds for Oracle-Based Adversarial Contextual Bandits
We give an oracle-based algorithm for the adversarial contextual bandit problem, where either contexts are drawn i.i.d. or the sequence of contexts is known a priori, but where the losses are picked adversarially. Our algorithm is computationally efficient, assuming access to an offline optimization oracle, and enjoys a regret of order O((KT)^{2/3} (log N)^{1/3}), where K is the number of actions,...
On Zeroth-Order Stochastic Convex Optimization via Random Walks
We propose a method for zeroth order stochastic convex optimization that attains the suboptimality rate of Õ(n^7 T^{-1/2}) after T queries for a convex bounded function f : R^n → R. The method is based on a random walk (the Ball Walk) on the epigraph of the function. The randomized approach circumvents the problem of gradient estimation, and appears to be less sensitive to noisy function evaluations c...
Journal
Journal title: Mathematical Statistics and Learning
Year: 2020
ISSN: 2520-2316
DOI: 10.4171/msl/17